Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 20640 |
| Missing cells | 207 |
| Missing cells (%) | 0.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.6 MiB |
| Average record size in memory | 80.0 B |
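The derived figures in this table follow from the raw counts. A minimal sketch in plain Python, assuming each of the 10 columns takes 8 bytes per row (an assumption inferred from the 80.0 B record size, not stated in the report):

```python
# Raw figures copied from the "Dataset statistics" table above.
n_variables = 10
n_observations = 20640
missing_cells = 207

total_cells = n_variables * n_observations        # 206400 cells
missing_pct = 100 * missing_cells / total_cells   # share of all cells

# Assumption: 8 bytes per value (e.g. float64 or an object pointer) per column.
bytes_per_record = 8 * n_variables
total_mib = bytes_per_record * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Record size: {bytes_per_record} B, total: {total_mib:.1f} MiB")
```

All 207 missing cells sit in a single column, which is why the cell-level share (0.1%) is ten times smaller than the column-level share (1.0%) reported in the alerts.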
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 1 |
Alerts
| longitude is highly correlated with latitude | High correlation |
|---|---|
| latitude is highly correlated with longitude | High correlation |
| total_rooms is highly correlated with total_bedrooms and 2 other fields | High correlation |
| total_bedrooms is highly correlated with total_rooms and 2 other fields | High correlation |
| population is highly correlated with total_rooms and 2 other fields | High correlation |
| households is highly correlated with total_rooms and 2 other fields | High correlation |
| median_income is highly correlated with median_house_value | High correlation |
| median_house_value is highly correlated with median_income | High correlation |
| longitude is highly correlated with latitude and 2 other fields | High correlation |
| latitude is highly correlated with longitude and 2 other fields | High correlation |
| median_house_value is highly correlated with longitude and 3 other fields | High correlation |
| ocean_proximity is highly correlated with longitude and 2 other fields | High correlation |
| total_bedrooms has 207 (1.0%) missing values | Missing |
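To give a feel for what a high-correlation alert reflects, here is a minimal sketch that computes a Pearson coefficient for total_rooms and households using only the ten rows listed under "First rows". This is an illustration on a tiny sample, not the report's own computation. The correlation measures and the threshold behind the alerts are not stated here, and the `pearson` helper is ours.

```python
from math import sqrt

# total_rooms and households for the ten rows listed under "First rows".
total_rooms = [880, 7099, 1467, 1274, 1627, 919, 2535, 3104, 2555, 3549]
households = [126, 1138, 177, 219, 259, 193, 514, 647, 595, 714]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(total_rooms, households)
print(f"Pearson r (10-row sample): {r:.3f}")
```

On this small sample the coefficient comes out above 0.9, in line with the alert that households is highly correlated with total_rooms.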
Reproduction
| Analysis started | 2022-03-12 18:39:52.368795 |
|---|---|
| Analysis finished | 2022-03-12 18:40:13.163938 |
| Duration | 20.8 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
longitude
Real number (ℝ)
| Distinct | 844 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -119.5697045 |
| Minimum | -124.35 |
|---|---|
| Maximum | -114.31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 20640 |
| Negative (%) | 100.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | -124.35 |
|---|---|
| 5-th percentile | -122.47 |
| Q1 | -121.8 |
| median | -118.49 |
| Q3 | -118.01 |
| 95-th percentile | -117.08 |
| Maximum | -114.31 |
| Range | 10.04 |
| Interquartile range (IQR) | 3.79 |
Descriptive statistics
| Standard deviation | 2.003531724 |
|---|---|
| Coefficient of variation (CV) | -0.01675618195 |
| Kurtosis | -1.330152366 |
| Mean | -119.5697045 |
| Median Absolute Deviation (MAD) | 1.28 |
| Skewness | -0.297801208 |
| Sum | -2467918.7 |
| Variance | 4.014139367 |
| Monotonicity | Not monotonic |
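Several entries in the quantile and descriptive tables above are simple functions of the others. A short sketch that re-derives them from the reported figures (variable names are ours):

```python
# Figures copied from the tables above.
minimum, maximum = -124.35, -114.31
q1, q3 = -121.8, -118.01
mean = -119.5697045
std = 2.003531724

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.2f}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4f}")
```

The coefficient of variation is negative only because the mean is negative; the standard deviation itself is always non-negative.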
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -118.31 | 162 | 0.8% |
| -118.3 | 160 | 0.8% |
| -118.29 | 148 | 0.7% |
| -118.27 | 144 | 0.7% |
| -118.32 | 142 | 0.7% |
| -118.28 | 141 | 0.7% |
| -118.35 | 140 | 0.7% |
| -118.36 | 138 | 0.7% |
| -118.19 | 135 | 0.7% |
| -118.25 | 128 | 0.6% |
| Other values (834) | 19202 | 93.0% |
| Value | Count | Frequency (%) |
| -124.35 | 1 | < 0.1% |
| -124.3 | 2 | < 0.1% |
| -124.27 | 1 | < 0.1% |
| -124.26 | 1 | < 0.1% |
| -124.25 | 1 | < 0.1% |
| -124.23 | 3 | < 0.1% |
| -124.22 | 1 | < 0.1% |
| -124.21 | 3 | < 0.1% |
| -124.19 | 4 | < 0.1% |
| -124.18 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| -114.31 | 1 | < 0.1% |
| -114.47 | 1 | < 0.1% |
| -114.49 | 1 | < 0.1% |
| -114.55 | 1 | < 0.1% |
| -114.56 | 1 | < 0.1% |
| -114.57 | 3 | < 0.1% |
| -114.58 | 2 | < 0.1% |
| -114.59 | 2 | < 0.1% |
| -114.6 | 3 | < 0.1% |
| -114.61 | 3 | < 0.1% |
latitude
Real number (ℝ≥0)
| Distinct | 862 |
|---|---|
| Distinct (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.63186143 |
| Minimum | 32.54 |
|---|---|
| Maximum | 41.95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 32.54 |
|---|---|
| 5-th percentile | 32.82 |
| Q1 | 33.93 |
| median | 34.26 |
| Q3 | 37.71 |
| 95-th percentile | 38.96 |
| Maximum | 41.95 |
| Range | 9.41 |
| Interquartile range (IQR) | 3.78 |
Descriptive statistics
| Standard deviation | 2.135952397 |
|---|---|
| Coefficient of variation (CV) | 0.05994501302 |
| Kurtosis | -1.117759781 |
| Mean | 35.63186143 |
| Median Absolute Deviation (MAD) | 1.23 |
| Skewness | 0.4659530037 |
| Sum | 735441.62 |
| Variance | 4.562292644 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 34.06 | 244 | 1.2% |
| 34.05 | 236 | 1.1% |
| 34.08 | 234 | 1.1% |
| 34.07 | 231 | 1.1% |
| 34.04 | 221 | 1.1% |
| 34.09 | 212 | 1.0% |
| 34.02 | 208 | 1.0% |
| 34.1 | 203 | 1.0% |
| 34.03 | 193 | 0.9% |
| 33.93 | 181 | 0.9% |
| Other values (852) | 18477 | 89.5% |
| Value | Count | Frequency (%) |
| 32.54 | 1 | < 0.1% |
| 32.55 | 3 | < 0.1% |
| 32.56 | 10 | < 0.1% |
| 32.57 | 18 | 0.1% |
| 32.58 | 26 | 0.1% |
| 32.59 | 11 | 0.1% |
| 32.6 | 9 | < 0.1% |
| 32.61 | 14 | 0.1% |
| 32.62 | 13 | 0.1% |
| 32.63 | 18 | 0.1% |
| Value | Count | Frequency (%) |
| 41.95 | 2 | < 0.1% |
| 41.92 | 1 | < 0.1% |
| 41.88 | 1 | < 0.1% |
| 41.86 | 3 | < 0.1% |
| 41.84 | 1 | < 0.1% |
| 41.82 | 1 | < 0.1% |
| 41.81 | 2 | < 0.1% |
| 41.8 | 3 | < 0.1% |
| 41.79 | 1 | < 0.1% |
| 41.78 | 3 | < 0.1% |
housing_median_age
Real number (ℝ≥0)
| Distinct | 52 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.63948643 |
| Minimum | 1 |
|---|---|
| Maximum | 52 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 8 |
| Q1 | 18 |
| median | 29 |
| Q3 | 37 |
| 95-th percentile | 52 |
| Maximum | 52 |
| Range | 51 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 12.58555761 |
|---|---|
| Coefficient of variation (CV) | 0.4394477408 |
| Kurtosis | -0.8006288536 |
| Mean | 28.63948643 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.0603306376 |
| Sum | 591119 |
| Variance | 158.3962604 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 52 | 1273 | 6.2% |
| 36 | 862 | 4.2% |
| 35 | 824 | 4.0% |
| 16 | 771 | 3.7% |
| 17 | 698 | 3.4% |
| 34 | 689 | 3.3% |
| 26 | 619 | 3.0% |
| 33 | 615 | 3.0% |
| 18 | 570 | 2.8% |
| 25 | 566 | 2.7% |
| Other values (42) | 13153 | 63.7% |
| Value | Count | Frequency (%) |
| 1 | 4 | < 0.1% |
| 2 | 58 | 0.3% |
| 3 | 62 | 0.3% |
| 4 | 191 | 0.9% |
| 5 | 244 | 1.2% |
| 6 | 160 | 0.8% |
| 7 | 175 | 0.8% |
| 8 | 206 | 1.0% |
| 9 | 205 | 1.0% |
| 10 | 264 | 1.3% |
| Value | Count | Frequency (%) |
| 52 | 1273 | 6.2% |
| 51 | 48 | 0.2% |
| 50 | 136 | 0.7% |
| 49 | 134 | 0.6% |
| 48 | 177 | 0.9% |
| 47 | 198 | 1.0% |
| 46 | 245 | 1.2% |
| 45 | 294 | 1.4% |
| 44 | 356 | 1.7% |
| 43 | 353 | 1.7% |
total_rooms
Real number (ℝ≥0)
| Distinct | 5926 |
|---|---|
| Distinct (%) | 28.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2635.763081 |
| Minimum | 2 |
|---|---|
| Maximum | 39320 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 620.95 |
| Q1 | 1447.75 |
| median | 2127 |
| Q3 | 3148 |
| 95-th percentile | 6213.2 |
| Maximum | 39320 |
| Range | 39318 |
| Interquartile range (IQR) | 1700.25 |
Descriptive statistics
| Standard deviation | 2181.615252 |
|---|---|
| Coefficient of variation (CV) | 0.8276977802 |
| Kurtosis | 32.630927 |
| Mean | 2635.763081 |
| Median Absolute Deviation (MAD) | 797 |
| Skewness | 4.147343451 |
| Sum | 54402150 |
| Variance | 4759445.106 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1527 | 18 | 0.1% |
| 1613 | 17 | 0.1% |
| 1582 | 17 | 0.1% |
| 2127 | 16 | 0.1% |
| 1703 | 15 | 0.1% |
| 1471 | 15 | 0.1% |
| 2053 | 15 | 0.1% |
| 1722 | 15 | 0.1% |
| 1607 | 15 | 0.1% |
| 1717 | 15 | 0.1% |
| Other values (5916) | 20482 | 99.2% |
| Value | Count | Frequency (%) |
| 2 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 15 | 2 | < 0.1% |
| 16 | 1 | < 0.1% |
| 18 | 4 | < 0.1% |
| 19 | 2 | < 0.1% |
| 20 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 39320 | 1 | < 0.1% |
| 37937 | 1 | < 0.1% |
| 32627 | 1 | < 0.1% |
| 32054 | 1 | < 0.1% |
| 30450 | 1 | < 0.1% |
| 30405 | 1 | < 0.1% |
| 30401 | 1 | < 0.1% |
| 28258 | 1 | < 0.1% |
| 27870 | 1 | < 0.1% |
| 27700 | 1 | < 0.1% |
total_bedrooms
Real number (ℝ≥0)
High correlation, Missing
| Distinct | 1923 |
|---|---|
| Distinct (%) | 9.4% |
| Missing | 207 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 537.8705525 |
| Minimum | 1 |
|---|---|
| Maximum | 6445 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 137 |
| Q1 | 296 |
| median | 435 |
| Q3 | 647 |
| 95-th percentile | 1275.4 |
| Maximum | 6445 |
| Range | 6444 |
| Interquartile range (IQR) | 351 |
Descriptive statistics
| Standard deviation | 421.3850701 |
|---|---|
| Coefficient of variation (CV) | 0.7834321252 |
| Kurtosis | 21.98557506 |
| Mean | 537.8705525 |
| Median Absolute Deviation (MAD) | 162 |
| Skewness | 3.459546332 |
| Sum | 10990309 |
| Variance | 177565.3773 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 280 | 55 | 0.3% |
| 331 | 51 | 0.2% |
| 345 | 50 | 0.2% |
| 393 | 49 | 0.2% |
| 343 | 49 | 0.2% |
| 394 | 48 | 0.2% |
| 328 | 48 | 0.2% |
| 348 | 48 | 0.2% |
| 272 | 47 | 0.2% |
| 309 | 47 | 0.2% |
| Other values (1913) | 19941 | 96.6% |
| (Missing) | 207 | 1.0% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 2 | < 0.1% |
| 3 | 5 | < 0.1% |
| 4 | 7 | < 0.1% |
| 5 | 6 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 8 | < 0.1% |
| 9 | 7 | < 0.1% |
| 10 | 8 | < 0.1% |
| Value | Count | Frequency (%) |
| 6445 | 1 | < 0.1% |
| 6210 | 1 | < 0.1% |
| 5471 | 1 | < 0.1% |
| 5419 | 1 | < 0.1% |
| 5290 | 1 | < 0.1% |
| 5033 | 1 | < 0.1% |
| 5027 | 1 | < 0.1% |
| 4957 | 1 | < 0.1% |
| 4952 | 1 | < 0.1% |
| 4819 | 1 | < 0.1% |
population
Real number (ℝ≥0)
| Distinct | 3888 |
|---|---|
| Distinct (%) | 18.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1425.476744 |
| Minimum | 3 |
|---|---|
| Maximum | 35682 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 348 |
| Q1 | 787 |
| median | 1166 |
| Q3 | 1725 |
| 95-th percentile | 3288 |
| Maximum | 35682 |
| Range | 35679 |
| Interquartile range (IQR) | 938 |
Descriptive statistics
| Standard deviation | 1132.462122 |
|---|---|
| Coefficient of variation (CV) | 0.7944444737 |
| Kurtosis | 73.55311639 |
| Mean | 1425.476744 |
| Median Absolute Deviation (MAD) | 440 |
| Skewness | 4.935858227 |
| Sum | 29421840 |
| Variance | 1282470.457 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 891 | 25 | 0.1% |
| 761 | 24 | 0.1% |
| 1227 | 24 | 0.1% |
| 850 | 24 | 0.1% |
| 1052 | 24 | 0.1% |
| 825 | 23 | 0.1% |
| 999 | 22 | 0.1% |
| 782 | 22 | 0.1% |
| 1005 | 22 | 0.1% |
| 781 | 21 | 0.1% |
| Other values (3878) | 20409 | 98.9% |
| Value | Count | Frequency (%) |
| 3 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 8 | 4 | < 0.1% |
| 9 | 2 | < 0.1% |
| 11 | 1 | < 0.1% |
| 13 | 4 | < 0.1% |
| 14 | 3 | < 0.1% |
| 15 | 2 | < 0.1% |
| 17 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 35682 | 1 | < 0.1% |
| 28566 | 1 | < 0.1% |
| 16305 | 1 | < 0.1% |
| 16122 | 1 | < 0.1% |
| 15507 | 1 | < 0.1% |
| 15037 | 1 | < 0.1% |
| 13251 | 1 | < 0.1% |
| 12873 | 1 | < 0.1% |
| 12427 | 1 | < 0.1% |
| 12203 | 1 | < 0.1% |
households
Real number (ℝ≥0)
| Distinct | 1815 |
|---|---|
| Distinct (%) | 8.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 499.5396802 |
| Minimum | 1 |
|---|---|
| Maximum | 6082 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 125 |
| Q1 | 280 |
| median | 409 |
| Q3 | 605 |
| 95-th percentile | 1162 |
| Maximum | 6082 |
| Range | 6081 |
| Interquartile range (IQR) | 325 |
Descriptive statistics
| Standard deviation | 382.3297528 |
|---|---|
| Coefficient of variation (CV) | 0.7653641301 |
| Kurtosis | 22.05798806 |
| Mean | 499.5396802 |
| Median Absolute Deviation (MAD) | 151 |
| Skewness | 3.410437712 |
| Sum | 10310499 |
| Variance | 146176.0399 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 306 | 57 | 0.3% |
| 386 | 56 | 0.3% |
| 335 | 56 | 0.3% |
| 282 | 55 | 0.3% |
| 429 | 54 | 0.3% |
| 375 | 53 | 0.3% |
| 284 | 51 | 0.2% |
| 297 | 51 | 0.2% |
| 362 | 50 | 0.2% |
| 380 | 50 | 0.2% |
| Other values (1805) | 20107 | 97.4% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 10 | < 0.1% |
| 8 | 8 | < 0.1% |
| 9 | 9 | < 0.1% |
| 10 | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 6082 | 1 | < 0.1% |
| 5358 | 1 | < 0.1% |
| 5189 | 1 | < 0.1% |
| 5050 | 1 | < 0.1% |
| 4930 | 1 | < 0.1% |
| 4855 | 1 | < 0.1% |
| 4769 | 1 | < 0.1% |
| 4616 | 1 | < 0.1% |
| 4490 | 1 | < 0.1% |
| 4372 | 1 | < 0.1% |
median_income
Real number (ℝ≥0)
| Distinct | 12928 |
|---|---|
| Distinct (%) | 62.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.870671003 |
| Minimum | 0.4999 |
|---|---|
| Maximum | 15.0001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 0.4999 |
|---|---|
| 5-th percentile | 1.60057 |
| Q1 | 2.5634 |
| median | 3.5348 |
| Q3 | 4.74325 |
| 95-th percentile | 7.300305 |
| Maximum | 15.0001 |
| Range | 14.5002 |
| Interquartile range (IQR) | 2.17985 |
Descriptive statistics
| Standard deviation | 1.899821718 |
|---|---|
| Coefficient of variation (CV) | 0.4908249026 |
| Kurtosis | 4.952524102 |
| Mean | 3.870671003 |
| Median Absolute Deviation (MAD) | 1.0642 |
| Skewness | 1.646656702 |
| Sum | 79890.6495 |
| Variance | 3.60932256 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3.125 | 49 | 0.2% |
| 15.0001 | 49 | 0.2% |
| 2.875 | 46 | 0.2% |
| 4.125 | 44 | 0.2% |
| 2.625 | 44 | 0.2% |
| 3.875 | 41 | 0.2% |
| 3 | 38 | 0.2% |
| 3.375 | 38 | 0.2% |
| 3.625 | 37 | 0.2% |
| 4 | 37 | 0.2% |
| Other values (12918) | 20217 | 98.0% |
| Value | Count | Frequency (%) |
| 0.4999 | 12 | 0.1% |
| 0.536 | 10 | < 0.1% |
| 0.5495 | 1 | < 0.1% |
| 0.6433 | 1 | < 0.1% |
| 0.6775 | 1 | < 0.1% |
| 0.6825 | 1 | < 0.1% |
| 0.6831 | 1 | < 0.1% |
| 0.696 | 1 | < 0.1% |
| 0.6991 | 1 | < 0.1% |
| 0.7007 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 15.0001 | 49 | 0.2% |
| 15 | 2 | < 0.1% |
| 14.9009 | 1 | < 0.1% |
| 14.5833 | 1 | < 0.1% |
| 14.4219 | 1 | < 0.1% |
| 14.4113 | 1 | < 0.1% |
| 14.2959 | 1 | < 0.1% |
| 14.2867 | 1 | < 0.1% |
| 13.947 | 1 | < 0.1% |
| 13.8556 | 1 | < 0.1% |
median_house_value
Real number (ℝ≥0)
| Distinct | 3842 |
|---|---|
| Distinct (%) | 18.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 206855.8169 |
| Minimum | 14999 |
|---|---|
| Maximum | 500001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 161.4 KiB |
Quantile statistics
| Minimum | 14999 |
|---|---|
| 5-th percentile | 66200 |
| Q1 | 119600 |
| median | 179700 |
| Q3 | 264725 |
| 95-th percentile | 489810 |
| Maximum | 500001 |
| Range | 485002 |
| Interquartile range (IQR) | 145125 |
Descriptive statistics
| Standard deviation | 115395.6159 |
|---|---|
| Coefficient of variation (CV) | 0.55785531 |
| Kurtosis | 0.3278702429 |
| Mean | 206855.8169 |
| Median Absolute Deviation (MAD) | 68400 |
| Skewness | 0.9777632739 |
| Sum | 4269504061 |
| Variance | 1.331614816 × 10^10 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 500001 | 965 | 4.7% |
| 137500 | 122 | 0.6% |
| 162500 | 117 | 0.6% |
| 112500 | 103 | 0.5% |
| 187500 | 93 | 0.5% |
| 225000 | 92 | 0.4% |
| 350000 | 79 | 0.4% |
| 87500 | 78 | 0.4% |
| 275000 | 65 | 0.3% |
| 150000 | 64 | 0.3% |
| Other values (3832) | 18862 | 91.4% |
| Value | Count | Frequency (%) |
| 14999 | 4 | < 0.1% |
| 17500 | 1 | < 0.1% |
| 22500 | 4 | < 0.1% |
| 25000 | 1 | < 0.1% |
| 26600 | 1 | < 0.1% |
| 26900 | 1 | < 0.1% |
| 27500 | 1 | < 0.1% |
| 28300 | 1 | < 0.1% |
| 30000 | 2 | < 0.1% |
| 32500 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 500001 | 965 | 4.7% |
| 500000 | 27 | 0.1% |
| 499100 | 1 | < 0.1% |
| 499000 | 1 | < 0.1% |
| 498800 | 1 | < 0.1% |
| 498700 | 1 | < 0.1% |
| 498600 | 1 | < 0.1% |
| 498400 | 1 | < 0.1% |
| 497600 | 1 | < 0.1% |
| 497400 | 1 | < 0.1% |
ocean_proximity
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.4 KiB |
| <1H OCEAN | 9136 |
|---|---|
| INLAND | 6551 |
| NEAR OCEAN | 2658 |
| NEAR BAY | 2290 |
| ISLAND | 5 |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.064922481 |
| Min length | 6 |
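The length statistics above can be re-derived from the category counts in this variable's "Common Values" table. A minimal sketch, assuming length is counted in characters including spaces and the leading `<`:

```python
from statistics import median

# Category counts copied from this variable's "Common Values" table.
counts = {
    "<1H OCEAN": 9136,
    "INLAND": 6551,
    "NEAR OCEAN": 2658,
    "NEAR BAY": 2290,
    "ISLAND": 5,
}

# Expand to one string length per observation.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", sum(lengths) / len(lengths))
print("Min length:", min(lengths))
```

The mean is a count-weighted average of the five string lengths, which is why it is not a whole number.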
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
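As an illustration of what such character properties look like, the sketch below tallies the Unicode general category of every character in the five category values, using Python's standard `unicodedata` module. It does not reproduce the report's own character tables, which are empty for this variable.

```python
import unicodedata
from collections import Counter

# The five category values that appear in this variable.
values = ["<1H OCEAN", "INLAND", "NEAR OCEAN", "NEAR BAY", "ISLAND"]

# Tally the Unicode general category of every character,
# e.g. "Lu" (uppercase letter) or "Zs" (space separator).
categories = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for cat, n in categories.most_common():
    print(cat, n)
```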
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | NEAR BAY |
|---|---|
| 2nd row | NEAR BAY |
| 3rd row | NEAR BAY |
| 4th row | NEAR BAY |
| 5th row | NEAR BAY |
Common Values
| Value | Count | Frequency (%) |
| <1H OCEAN | 9136 | 44.3% |
| INLAND | 6551 | 31.7% |
| NEAR OCEAN | 2658 | 12.9% |
| NEAR BAY | 2290 | 11.1% |
| ISLAND | 5 | < 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| ocean | 11794 | 34.0% |
| 1h | 9136 | 26.3% |
| inland | 6551 | 18.9% |
| near | 4948 | 14.2% |
| bay | 2290 | 6.6% |
| island | 5 | < 0.1% |
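The word counts above are consistent with splitting each category value into lowercase tokens and weighting each token by the category's count. A minimal sketch under that assumption (the exact tokenisation used by the report is not stated, so the regular expression here is ours):

```python
import re
from collections import Counter

# Category counts copied from the "Common Values" table.
counts = {
    "<1H OCEAN": 9136,
    "INLAND": 6551,
    "NEAR OCEAN": 2658,
    "NEAR BAY": 2290,
    "ISLAND": 5,
}

words = Counter()
for value, n in counts.items():
    # Assumption: lowercase, then keep runs of letters and digits as tokens.
    for token in re.findall(r"[a-z0-9]+", value.lower()):
        words[token] += n

total = sum(words.values())
for token, n in words.most_common():
    print(f"{token:<8}{n:>7}{100 * n / total:6.1f}%")
```

Here "ocean" collects the counts of both "<1H OCEAN" and "NEAR OCEAN", and the percentages are shares of all tokens rather than of rows.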
Most occurring characters
| Value | Count | Frequency (%) |
| No values found. | ||
Most occurring categories
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per category
Most occurring scripts
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per script
Most occurring blocks
| Value | Count | Frequency (%) |
| No values found. | ||
Most frequent character per block
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
First rows
| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |
| 5 | -122.25 | 37.85 | 52.0 | 919.0 | 213.0 | 413.0 | 193.0 | 4.0368 | 269700.0 | NEAR BAY |
| 6 | -122.25 | 37.84 | 52.0 | 2535.0 | 489.0 | 1094.0 | 514.0 | 3.6591 | 299200.0 | NEAR BAY |
| 7 | -122.25 | 37.84 | 52.0 | 3104.0 | 687.0 | 1157.0 | 647.0 | 3.1200 | 241400.0 | NEAR BAY |
| 8 | -122.26 | 37.84 | 42.0 | 2555.0 | 665.0 | 1206.0 | 595.0 | 2.0804 | 226700.0 | NEAR BAY |
| 9 | -122.25 | 37.84 | 52.0 | 3549.0 | 707.0 | 1551.0 | 714.0 | 3.6912 | 261100.0 | NEAR BAY |
Last rows
| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 20630 | -121.32 | 39.29 | 11.0 | 2640.0 | 505.0 | 1257.0 | 445.0 | 3.5673 | 112000.0 | INLAND |
| 20631 | -121.40 | 39.33 | 15.0 | 2655.0 | 493.0 | 1200.0 | 432.0 | 3.5179 | 107200.0 | INLAND |
| 20632 | -121.45 | 39.26 | 15.0 | 2319.0 | 416.0 | 1047.0 | 385.0 | 3.1250 | 115600.0 | INLAND |
| 20633 | -121.53 | 39.19 | 27.0 | 2080.0 | 412.0 | 1082.0 | 382.0 | 2.5495 | 98300.0 | INLAND |
| 20634 | -121.56 | 39.27 | 28.0 | 2332.0 | 395.0 | 1041.0 | 344.0 | 3.7125 | 116800.0 | INLAND |
| 20635 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | 78100.0 | INLAND |
| 20636 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | 77100.0 | INLAND |
| 20637 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | 92300.0 | INLAND |
| 20638 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | 84700.0 | INLAND |
| 20639 | -121.24 | 39.37 | 16.0 | 2785.0 | 616.0 | 1387.0 | 530.0 | 2.3886 | 89400.0 | INLAND |